Abstract

In multiple choice items the response probability on an item
may be viewed as the result of two distinct latent processes:
a cognitive process to solve the problem and a random
process that leads to the choice of a certain alternative. An
incomplete latent class model is formulated that describes the
first process by a Rasch model and the second process by a
guessing model.

Alternative models are specified that contain
additional parameters describing differential item
functioning (DIF) in the two processes.

DIF with respect to either known or unknown subgroups
can be tested by a likelihood ratio test that is
asymptotically distributed as chi-square.

Key words: differential item functioning, multiple choice 
items, Rasch model, guessing model, incomplete 
latent class model, goodness of fit testing 
Differential Item Functioning
in Multiple Choice Items

Items in educational or psychological tests show differential
item performance (DIF) if the probability of a correct
response among equally able test takers is different between
racial, ethnic, or other subgroups. DIF may lead to tests
that are unfair for certain subgroups, and it is important to
spot such items so that they can be improved or deleted from
the test.

Many DIF detection methods have been proposed since 
Binet and Simon (1916, see also Jensen, 1980, p. 367) were 
the first to draw attention to this problem. Reviews of older 
DIF (also called item bias) detection methods are given by 
Osterlind (1983) and Shepard, Camilli and Averill (1981). 
Handbooks on item bias detection methods are provided by Berk
(1982) and Jensen (1980).

In the last decade methods have been improved by giving
better possibilities to match on ability. Various methods
have used the number correct score of the test for this
purpose (Camilli, 1979; Holland & Thayer, 1986; Kok,
Mellenbergh, & van der Flier, 1985; Mellenbergh, 1982;
Nungester, 1977 (see Ironson, 1982); Scheuneman, 1979).

Recently, DIF detection methods have been proposed that
are based on item response theory (IRT) (Durovic, 1975;
Fischer & Formann, 1982; Lord, 1980; Mislevy, 1981; Muthén &
Lehman, 1985; Wright, Mead & Draba, 1975). An IRT model
explains the probability of an item response on the basis of
a person parameter and one or more item parameters.
Differences between estimated item parameters across
subgroups are considered as an indication of DIF. Thissen,
Steinberg and Wainer (1989b) give an overview of IRT-based
DIF detection methods and demonstrate their use. They also
discuss DIF detection methods that can be used with multiple
choice items.

The fact that in multiple choice items response
alternatives are given introduces new potential sources of
DIF. Green, Crone and Folk (1989) focus on differential
popularity of the incorrect responses (or "distractors"). If
a particular distractor is more attractive to subjects from
one subgroup than to those from another, Green et al.
conjecture that "...the item probably means something
different to the different groups". They perform loglinear
analysis of the subgroup x score group x incorrect response
contingency table for each item, to detect distractors that
are more popular in one subgroup than in another.

Another source of DIF in multiple choice items does not
involve the popularity of the distractors, but concerns
differential difficulty of the problem to be solved. Just as
in other types of items, an item may pose a problem that is
more difficult to some subjects than to others, even if they
are equally able on the trait of interest. In this paper an
item bias detection model is described that separates both
sources of bias.

In the model it is assumed that the subject's response
to a certain item depends on two distinct processes. The
first process determines whether an individual with a certain
ability solves the problem that is presented by the item; the
second process determines the actual response given.

Furthermore, we assume that if the subject solves the 
problem, (s)he will give the correct answer. Here the 
probability that the subject solves the problem is assumed to 
be governed by a Rasch (1960) model. If the subject cannot 
solve the problem the subject will guess the answer, where 
the guessing probabilities may be different for different 
alternatives. 

The model differs from that of Thissen, Steinberg and
Fitzpatrick (1989a), who distinguish between a "Don't know"
state and a state in which the subject has partial or
complete knowledge of the answer. In the "Don't know" state
the subject guesses the answer as before, but in the "Partial
knowledge" state the subject may choose a response
alternative, where the response probabilities are governed by
Bock's (1972) nominal response model.

The proposed model is simpler than the model by Thissen
et al. (1989a). This has two advantages. Firstly, it contains
fewer parameters. For example, in a four-choice item, our
model has five item parameters while Thissen's model has
fourteen. Obviously, if the sample is not very large the
parameters in the latter model cannot be estimated reliably.
So, in that case one may be inclined to "buy information by
assumption" and use the simpler model. Secondly, the proposed
model can easily be formulated as a latent class analysis
(LCA) model. LCA models have been introduced by Lazarsfeld
(1950; see also Lazarsfeld & Henry, 1968) and developed
further by Goodman (1973), Haberman (1979), Clogg (1981) and
others. LCA models have been used extensively for measurement
in sociology, psychology and education. Formann (1985),
Kelderman (1988, 1989), Kelderman and Macready (1988),
Mislevy and Verhelst (1987) and Yamamoto (1987, 1988)
integrated IRT models into LCA models. There is a
well-developed theory for maximum-likelihood estimation and
likelihood-ratio testing of LCA models. By comparing the fit
of different latent class models, DIF in the guessing
probabilities and DIF in the parameters of the Rasch model
can be tested separately. Also, the model can be extended
with latent classes, so that the subgroups for which the
items exhibit DIF may be latent too.

In what follows the model for multiple choice items is
developed and formulated as an LCA model. Different models for
the detection of DIF are formulated. Also a model with a latent
subgroup variable is discussed. A computationally efficient
estimation method is described and its use is illustrated
using empirical data.
A Model for Multiple Choice Items

Suppose that each subject, randomly drawn from a population
of N subjects, responds to k test items, where his/her answer
to item j may be any of $r_j$ responses $y_j$
($y_j = 1, \ldots, r_j$). The response pattern of this subject
on the test items is denoted by the vector
$y = (y_1, \ldots, y_k)$. The corresponding random
variables are denoted by capital letters $Y_j$
($j = 1, \ldots, k$) and $Y$. Let $x_j$ indicate the latent
response of the subject, taking values $x_j = 1$ if (s)he
solved the problem or $x_j = 0$ if (s)he did not solve the
problem posed by item j, and let $x = (x_1, \ldots, x_k)$
be the vector of these values. The corresponding random
variables are denoted by $X_j$ and $X$.

The relationship between the latent responses $x_j$ and the
observed responses $y_j$ is described by the conditional
probability

(1)  $\Phi^{X_j Y_j}_{x_j y_j} = P(y_j \mid x_j)$

where the superscripts are symbolic notation indicating that
the random variables $X_j$ and $Y_j$ are involved in the
conditional probability. For the sake of simplicity, the
notations $y_j$, $x_j$, etc. in the probabilities are used for
$Y_j = y_j$, $X_j = x_j$, etc.

It is assumed that if the subject can solve the problem,
(s)he chooses the correct alternative, that is,
$\Phi^{X_j Y_j}_{1 y_j}$ must equal 1 if $y_j$ is the right
alternative.

Assuming that $y_j$ depends on $x_j$ only, we have

(2)  $P(y \mid x, \theta) = \prod_{j=1}^{k} P(y_j \mid x_j) = \prod_{j=1}^{k} \Phi^{X_j Y_j}_{x_j y_j}$

where $\theta$ is the latent ability value.
The latent responses are assumed to be governed by a
one-parameter logistic model (Rasch, 1960), where the
probability of the latent response $x_j$ given that the subject
has ability $\theta$ is

(3)  $P(x_j \mid \theta) = \exp(x_j(\theta - \delta_j)) / (1 + \exp(\theta - \delta_j))$

and $\delta_j$ is the difficulty of item j.

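Assuming that $x_j$ depends only on the latent ability
$\theta$ we have

(4)  $P(x \mid \theta) = \prod_{j=1}^{k} P(x_j \mid \theta) = \exp\left(t\theta - \sum_{j=1}^{k} x_j \delta_j\right) C(\theta, \delta)^{-1}$

with

$C(\theta, \delta) = \prod_{j=1}^{k} (1 + \exp(\theta - \delta_j))$

where $\delta = (\delta_1, \ldots, \delta_k)$, and
$t = x_1 + \cdots + x_k$ is the number correct score.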
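
Equations (3) and (4) can be checked numerically. What follows is a minimal Python sketch, not the software used in this paper; the function names and the difficulty and ability values are illustrative assumptions. It computes the latent response probabilities of the Rasch model item by item and confirms that their product agrees with the closed form in (4).

```python
import math
from itertools import product

def rasch_prob(x_j, theta, delta_j):
    """Equation (3): probability of latent response x_j (1 = solved, 0 = not)."""
    return math.exp(x_j * (theta - delta_j)) / (1.0 + math.exp(theta - delta_j))

def pattern_prob(x, theta, delta):
    """Equation (4): exp(t*theta - sum_j x_j*delta_j) / C(theta, delta)."""
    t = sum(x)  # number of solved problems
    c = math.prod(1.0 + math.exp(theta - d) for d in delta)
    return math.exp(t * theta - sum(xj * d for xj, d in zip(x, delta))) / c

# Hypothetical difficulties for three items and one hypothetical ability value.
delta = [-1.0, 0.0, 1.5]
theta = 0.3

patterns = list(product([0, 1], repeat=len(delta)))
for x in patterns:
    by_product = math.prod(rasch_prob(xj, theta, d) for xj, d in zip(x, delta))
    # The item-wise product and the closed form must coincide.
    assert abs(by_product - pattern_prob(x, theta, delta)) < 1e-12

# The probabilities of all 2^k latent patterns sum to one.
total = sum(pattern_prob(x, theta, delta) for x in patterns)
```

The closed form shows that the ability enters only through t, which is what later allows the integral over the ability distribution to be absorbed into a parameter that depends on t alone.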

Let $F(\theta)$ be the continuous distribution function of the
latent ability $\theta$. Using (2) and (4), the marginal
probability of the observed responses y then becomes

(5)  $P(y) = \sum_{x} \left[ \prod_{j=1}^{k} \Phi^{X_j Y_j}_{x_j y_j} \exp(-x_j \delta_j) \right] \int_{-\infty}^{\infty} \exp(t\theta)\, C(\theta, \delta)^{-1}\, dF(\theta)$.



In the next section we will formulate this model as an
incomplete latent class model. The integral in model (5) will
then be absorbed into a latent class parameter which depends
only on the number correct score t. This means that it is not
necessary to specify the distribution function $F(\theta)$ any
further.
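
To make the structure of model (5) concrete, the following Python sketch evaluates the marginal probability of an observed response pattern by summing over all latent patterns x. It is an illustration under stated assumptions, not the estimation procedure of this paper: the ability distribution is replaced by a hypothetical three-point discrete distribution, and the names `marginal_prob`, `guess`, `correct` and `abilities`, as well as all numerical values, are invented for the example.

```python
import math
from itertools import product

def marginal_prob(y, delta, guess, correct, abilities):
    """P(y) as in model (5), with a discrete stand-in for F(theta).

    guess[j][a] : probability of guessing alternative a on item j when x_j = 0
    correct[j]  : index of the correct alternative of item j
    abilities   : list of (theta, weight) pairs whose weights sum to one
    """
    k = len(delta)
    total = 0.0
    for x in product([0, 1], repeat=k):
        # Observed-response part: a subject who solves the problem always
        # gives the correct answer; otherwise the answer is guessed.
        phi = 1.0
        for j in range(k):
            if x[j] == 1:
                phi *= 1.0 if y[j] == correct[j] else 0.0
            else:
                phi *= guess[j][y[j]]
        if phi == 0.0:
            continue  # structurally impossible (x, y) combination
        t = sum(x)
        item_part = math.exp(-sum(xj * d for xj, d in zip(x, delta)))
        # The ability part depends on x only through the score t.
        ability_part = sum(
            w * math.exp(t * th) / math.prod(1.0 + math.exp(th - d) for d in delta)
            for th, w in abilities
        )
        total += phi * item_part * ability_part
    return total

# Hypothetical two-item test with three alternatives per item.
delta = [-0.5, 0.8]
guess = [[0.2, 0.5, 0.3], [0.25, 0.25, 0.5]]
correct = [0, 2]
abilities = [(-1.0, 0.3), (0.0, 0.4), (1.0, 0.3)]

probs = {y: marginal_prob(y, delta, guess, correct, abilities)
         for y in product(range(3), repeat=2)}
```

The skipped combinations are exactly the cells that make the latent class table incomplete: a latent pattern with $x_j = 1$ cannot occur together with an incorrect observed response to item j.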

To detect DIF in multiple choice items, model (5) has to 
be extended with subgroups. In order to keep the main idea of 
this section clear the subgroups have been ignored so far. In 
the third section we will extend the incomplete latent class 
model with the subgroups. 



An Incomplete-Latent-Class Model

Kelderman (1988) has shown that model (5) is an incomplete
latent-class model in the sense of Haberman (1979, ch. 10):

(6)  $P(y) = \sum_{x} \Phi^{T}_{t} \prod_{j=1}^{k} \Phi^{X_j}_{x_j} \Phi^{X_j Y_j}_{x_j y_j}$

with

$\Phi^{T}_{t} = \int_{-\infty}^{\infty} \exp(t\theta)\, C(\theta, \delta)^{-1}\, dF(\theta)$,

$\Phi^{X_j}_{x_j} = \exp(-x_j \delta_j)$,  $j = 1, \ldots, k$,

and where the $\Phi$-parameters are subject to the restrictions

(7)  $\Phi^{X_j}_{0} = 1$,  $j = 1, \ldots, k$,

(8)  $\Phi^{X_j Y_j}_{x_j 1} + \cdots + \Phi^{X_j Y_j}_{x_j r_j} = 1$,  $j = 1, \ldots, k$.

In this model each value of x represents a latent class.
Model (6) is incomplete because for certain given values of x
only a limited number of combinations $(y_1, \ldots, y_k)$ are
possible. Because $\Phi^{T}_{t}$ depends on an
underlying latent trait distribution $F(\theta)$, these
parameters are subject to the following complex inequality
constraints (Cressie & Holland, 1983; Kelderman, 1984):

$\det\left( \| \Phi^{T}_{r+s} \|_{r,s=0}^{q_1} \right) \ge 0$

and

$\det\left( \| \Phi^{T}_{r+s+1} \|_{r,s=0}^{q_2} \right) \ge 0$

where

$q_1 = k/2$ if k is even, $q_1 = (k-1)/2$ if k is odd,

$q_2 = (k-2)/2$ if k is even, $q_2 = (k-1)/2$ if k is odd,

and $\det( \| \cdot \|_{r,s=0}^{q} )$ means the determinant of a
matrix with row index r and column index s both running from
zero to q.

Since it is not our goal to fit a model for the data,
but to decide if a certain item exhibits DIF, we will follow
Cressie and Holland and ignore these inequality constraints.
This, the so-called generalized Rasch model, provides an easy
way to decide whether an item exhibits DIF. The generalized
Rasch model is also equivalent to the "conditional" Rasch
model, that is, a Rasch model in which there is
conditioning on the number correct score (Kelderman, 1984).

Incomplete table methodology can be used to formulate 
several hypotheses about DIF by specifying alternative models 
that contain additional subgroup-dependent parameters. 



Parameters describing DIF 

An item can show DIF in two different ways. First, as
indicated before, the item exhibits DIF if equally able
individuals from different subgroups have different
probabilities of solving the problem that the item poses.
This will be called DIF in the latent response.

It was assumed earlier that if the subject can solve the
problem (s)he will choose the correct alternative. But if the
subject cannot solve the problem, (s)he would guess the most
attractive alternative. Therefore, the item also exhibits DIF
if the attractiveness of the alternatives varies from
subgroup to subgroup. This will be called DIF in the guessing
probabilities.

In most applications subgroup membership (e.g., sex) is
known. In some situations, however, items are expected to
exhibit DIF with respect to certain subgroups, but it is not
known to which subgroup each of the individuals belongs.

In the following, models are formulated for studying the
two types of DIF, i.e., both for DIF in the latent response
and DIF in the guessing probabilities. Further, the cases
that the subgroup i (i = 1, ..., g) is observed or that it is
not observed are considered.

DIF in the Latent Response 

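To detect DIF with respect to the process of solving the
problem, an alternative model is formulated as

(9)  $P(y \mid i) = \sum_{x} \Phi^{IT}_{it}\, \Phi^{IX_1}_{i x_1}\, \Phi^{X_2}_{x_2} \cdots \Phi^{X_k}_{x_k}\, \Phi^{X_1 Y_1}_{x_1 y_1} \cdots \Phi^{X_k Y_k}_{x_k y_k}$

where $P(y \mid i)$ is the conditional distribution of observed
response y given observed subgroup i,
$\Phi^{IX_1}_{i x_1} = \exp(-x_1 \delta_{i1})$, $\delta_{i1}$
is the difficulty of item 1 in subgroup i, and

$\Phi^{IT}_{it} = \int_{-\infty}^{\infty} \exp(t\theta)\, C(\theta, \delta)^{-1}\, dF_i(\theta)$

where $F_i(\theta)$ is the distribution of the latent trait in
subgroup i.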

To test whether the interaction between subgroup i and
the latent response to item 1 is zero, i.e., whether item 1
exhibits no DIF in the latent response, this alternative model
is compared with the model

(10)  $P(y \mid i) = \sum_{x} \Phi^{IT}_{it}\, \Phi^{X_1}_{x_1} \cdots \Phi^{X_k}_{x_k}\, \Phi^{X_1 Y_1}_{x_1 y_1} \cdots \Phi^{X_k Y_k}_{x_k y_k}$

If the test is significant, it may be concluded that the
difficulty of item 1 varies from subgroup to subgroup. In
this case the subjects in one subgroup may find it more
difficult to solve the problem than subjects from another
subgroup.

DIF in the Guessing Probabilities

To test the null hypothesis that the interaction between
the subgroup and the observed response to item 1 is zero,
i.e., that item 1 exhibits no DIF in the guessing
probabilities, the alternative model

(11)  $P(y \mid i) = \sum_{x} \Phi^{IT}_{it}\, \Phi^{X_1}_{x_1} \cdots \Phi^{X_k}_{x_k}\, \Phi^{IX_1 Y_1}_{i x_1 y_1}\, \Phi^{X_2 Y_2}_{x_2 y_2} \cdots \Phi^{X_k Y_k}_{x_k y_k}$

where $P(y \mid i)$ is the conditional distribution of observed
response y given observed subgroup i and
$\Phi^{IX_1 Y_1}_{i x_1 y_1} = P(y_1 \mid x_1, i)$ is
the conditional probability of the observed response given the
latent response and observed subgroup i, is compared with
model (10). If the test is significant, it may be concluded
that the attractiveness of the alternatives of item 1 varies
from subgroup to subgroup.

In model (9) and model (11) the $\Phi$-terms are specified to
test DIF for only one item. Obviously, similar model terms
can be specified for two or more items if necessary. It is
also possible to analyse models in which one item exhibits
DIF in the latent response and another (or the same) item DIF
in the guessing probabilities.

Latent Subgroup Models 

When subgroup membership is unobserved, the subgroup
variable I also becomes a latent variable, and the models for
the detection of DIF are still latent-class models. Models
with unobserved subgroups are very useful in situations where
grouping information is not available, or when it is not
desirable to link the concept of DIF to any specific manifest
variable.

Unlike the models in (9) to (11), the models with
unobserved subgroups are not always identified. For example,
a model with unobserved subgroups in which only one item
exhibits DIF in the latent response is not identified. In
order to overcome this problem, models can be specified to
test DIF for v items ($2 \le v < k$). The models (9) to (11)
then become

(12)  $P(y) = \sum_{i} \sum_{x} \Phi^{IT}_{it}\, \Phi^{IX_1}_{i x_1} \cdots \Phi^{IX_v}_{i x_v}\, \Phi^{X_{v+1}}_{x_{v+1}} \cdots \Phi^{X_k}_{x_k}\, \Phi^{X_1 Y_1}_{x_1 y_1} \cdots \Phi^{X_k Y_k}_{x_k y_k}$,

(13)  $P(y) = \sum_{i} \sum_{x} \Phi^{IT}_{it}\, \Phi^{X_1}_{x_1} \cdots \Phi^{X_k}_{x_k}\, \Phi^{X_1 Y_1}_{x_1 y_1} \cdots \Phi^{X_k Y_k}_{x_k y_k}$,

and

(14)  $P(y) = \sum_{i} \sum_{x} \Phi^{IT}_{it}\, \Phi^{X_1}_{x_1} \cdots \Phi^{X_k}_{x_k}\, \Phi^{IX_1 Y_1}_{i x_1 y_1} \cdots \Phi^{IX_v Y_v}_{i x_v y_v}\, \Phi^{X_{v+1} Y_{v+1}}_{x_{v+1} y_{v+1}} \cdots \Phi^{X_k Y_k}_{x_k y_k}$

where $\Phi^{IX_1}_{i x_1} = \exp(-x_1 \delta_{i1})$,
$\delta_{i1}$ is the difficulty of item 1 in
latent subgroup i, and
$\Phi^{IX_1 Y_1}_{i x_1 y_1} = P(y_1 \mid x_1, i)$ is the
conditional distribution of the observed response given the
latent response and latent subgroup i.

Just as in the case of observed subgroups, it is also
possible to analyse models in which some items exhibit DIF in
the latent response and other (or the same) items DIF in the
guessing probabilities.



Parameter Estimation and Model Testing 

Let $n_{ixy}$ be the number of individuals in subgroup i with
X=x and Y=y under a certain model and let
$m_{ixy} = N P(i, x, y)$ be the expected value of $n_{ixy}$.
Although $n_{ixy}$ is not observed, it is possible to estimate
the means $m_{ixy}$ of $n_{ixy}$, and the $\Phi$-parameters,
from the observed $n_{iy}$ (or $n_y$ if the subgroup is
unobserved) by the method of maximum likelihood. To
illustrate this, consider the model defined by

(15)  $m_{ixy} = N\, \Phi^{IT}_{it} \prod_{j=1}^{k} \Phi^{IX_j}_{i x_j} \Phi^{IX_j Y_j}_{i x_j y_j}$

The maximum likelihood equations for model (15) would be
(Haberman, 1979):

$\hat{m}^{IT}_{it} = \hat{n}^{IT}_{it}$,  $\hat{m}^{IX_j Y_j}_{i x_j y_j} = \hat{n}^{IX_j Y_j}_{i x_j y_j}$,  $j = 1, \ldots, k$

where

(16)  $\hat{n}_{ixy} = (\hat{m}_{ixy} / \hat{m}_{iy})\, n_{iy}$,

and where $n^{IT}_{it}$ and $n^{IX_j Y_j}_{i x_j y_j}$ are the
numbers of individuals in subgroup i with T=t, and with
$X_j = x_j$ and $Y_j = y_j$, respectively. Further
$m^{IT}_{it}$ and $m^{IX_j Y_j}_{i x_j y_j}$ are the expected
values of $n^{IT}_{it}$ and $n^{IX_j Y_j}_{i x_j y_j}$.
If the subgroup i is not observed, then $n_{iy}$ and $m_{iy}$
in (16) have to be replaced by $n_y$ and $m_y$.

The equations can be solved by the iterative 
proportional fitting algorithm or the scoring algorithm 
(Goodman, 1978; Haberman, 1979) . The iterative proportional 
fitting algorithm is to be preferred, since it is less 
sensitive to the choice of starting values. 
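
The idea behind iterative proportional fitting can be shown on a deliberately simplified case. The Python sketch below is an assumption-laden illustration, not the algorithm as implemented for the latent class models of this paper: it fits a fully observed two-way table to given row and column totals, whereas in the models above the fitted margins involve unobserved counts that must themselves be estimated via (16). The function name `ipf` and the target totals are hypothetical.

```python
def ipf(table, row_targets, col_targets, iters=50):
    """Rescale a two-way table until its margins match the target totals."""
    fitted = [row[:] for row in table]  # work on a copy
    n_rows = len(fitted)
    for _ in range(iters):
        # Step 1: scale every row so that its sum equals the row target.
        for r, target in enumerate(row_targets):
            s = sum(fitted[r])
            fitted[r] = [v * target / s for v in fitted[r]]
        # Step 2: scale every column so that its sum equals the column target.
        for c, target in enumerate(col_targets):
            s = sum(fitted[r][c] for r in range(n_rows))
            for r in range(n_rows):
                fitted[r][c] *= target / s
    return fitted

# Uniform starting values; hypothetical margins that both total 100.
start = [[1.0, 1.0], [1.0, 1.0]]
fitted = ipf(start, row_targets=[30.0, 70.0], col_targets=[40.0, 60.0])
```

Each cycle adjusts the current estimates proportionally to one set of sufficient margins at a time, which is why the method needs no derivatives and tolerates crude starting values.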

In model (15) all items were considered to exhibit DIF
in the latent response and DIF in the guessing probabilities.
If some items exhibit no DIF in the latent response or no DIF
in the guessing probabilities, then the $\Phi$-parameters for
these items are restricted. For example, if in a certain model
item 1 exhibits no DIF in the latent response, then the
$\Phi^{IX_1}_{i x_1}$-parameter is restricted in the following
manner:

$\Phi^{IX_1}_{1 x_1} = \cdots = \Phi^{IX_1}_{g x_1}$

Similar estimation equations can be formulated for restricted
models.

The overall goodness of fit of an incomplete latent-class
model can be tested by the Pearson statistic (Q) or the
likelihood-ratio statistic (LR) (see Haberman, 1979). Both
statistics are asymptotically distributed as chi-square with
degrees of freedom equal to the difference between the number
of counts $n_y$ (or $n_{iy}$ if the subgroup is observed) and
the number of estimable parameters. The number of estimable
parameters of a model should be equal to the rank of the
information matrix (cf. McHugh, 1956; Goodman, 1978).

The difference in likelihood-ratio test statistics of
two models, LR(a;b), can be used to test whether the
alternative model (b) yields a significant improvement in fit
over the compact model (a), which is a special case of model
(b). Under the assumption of model (a), LR(a;b) is
asymptotically chi-square distributed with degrees of freedom
equal to the difference in numbers of estimable parameters of
both models (Bishop, Fienberg & Holland, 1975).
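
The difference test can be sketched in a few lines of Python. This is an illustration only: the function names `chi2_sf` and `lr_difference_test` are hypothetical, the upper-tail probability is computed with a standard recurrence for the incomplete gamma function at integer and half-integer arguments rather than with any program used in this paper, and the overall LR values and parameter counts in the usage lines are invented numbers.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)               # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))    # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def lr_difference_test(lr_a, npar_a, lr_b, npar_b):
    """LR(a;b) for compact model (a) nested in alternative model (b)."""
    stat = lr_a - lr_b        # improvement in fit of (b) over (a)
    df = npar_b - npar_a      # extra estimable parameters in (b)
    return stat, df, chi2_sf(stat, df)

# Hypothetical overall LR statistics and parameter counts for two nested models.
stat, df, p_value = lr_difference_test(lr_a=30.0, npar_a=10, lr_b=25.0, npar_b=11)

# Sanity checks against the familiar 5% critical values.
p_df1 = chi2_sf(3.841, 1)
p_df4 = chi2_sf(9.488, 4)
```

A p-value below the chosen significance level leads to rejection of the compact model in favour of the alternative model.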



An Empirical Example 

As an example, four items from the Second International
Mathematics Study in the Netherlands were considered (Eggen,
Pelgrum & Plomp, 1987). Each item was a five-choice item with
only one correct alternative.

A sample of 3002 students from two school types of lower
secondary education in the Netherlands, representing the whole
ability range, was drawn. To illustrate the use of
quasi-loglinear models for detection of DIF, the students'
level of education was chosen as grouping variable: subgroup
MAVO (intermediate general education) and subgroup HAVO/VWO
(higher general education and pre-university education).

The models (9) and (11) were fitted to the data using
the computer program LCAG (Hagenaars & Luijkx, 1987). LCAG is
a program for estimating the parameters of loglinear models
with latent variables. LCAG yields, besides the estimated
latent conditional probabilities (i.e., the guessing
probabilities), the estimated expected frequency distribution
of the latent variables under the model. From this frequency
distribution the difficulty parameters were estimated using
LOGIMO (Kelderman & Steen, 1988). LOGIMO is a general
computer program especially written to analyse loglinear IRT
models.

DIF is tested by comparing model (9) (for DIF in the
latent response) and model (11) (for DIF in the guessing
probabilities) with model (10) (no DIF). In Table 1 for each
item the values of the likelihood ratio test and the degrees
of freedom are shown for models (9) and (11). In both cases
the level of education was observed.



Insert Table 1 about here 



From Table 1 it may be concluded that, except for item
2, the difficulty of solving the problems represented by the
items does not vary significantly between the subgroups MAVO
and HAVO/VWO. In Table 2 the difficulty parameters of the
four items in the model in which item 2 exhibits DIF in the
latent response are given.
Insert Table 2 about here



It can be seen from Table 2 that the difficulty of item
2 was substantially smaller for MAVO students than for
HAVO/VWO students.

Table 1 also shows that the attractiveness of the
alternatives of items 1, 2, and 4 differed significantly
between the two subgroups. To give a more detailed
interpretation of the attractiveness of the alternatives, the
guessing probabilities of the alternatives for each item are
presented in Table 3.



Insert Table 3 about here 



For a HAVO/VWO student the correct alternative of item 1
is more attractive than for a MAVO student, so (s)he is more
inclined to choose the correct alternative. On the other hand,
a MAVO student would be more inclined to choose the correct
alternative of item 2, because his/her guessing probability
of the correct alternative is twice as big as the guessing
probability for a HAVO/VWO student. However, for both
subgroups the correct alternative is not the most attractive
alternative.

The guessing probabilities for the correct alternative
of item 4 are almost the same for both subgroups, but for the
alternatives B and C there is a curious difference between the
two subgroups. A HAVO/VWO student would guess alternative B
with almost the same probability as a MAVO student would
guess alternative C, and would guess alternative C with almost
the same probability as a MAVO student would guess
alternative B.

Item 3 exhibits no DIF in the guessing probabilities. 
However, alternatives B and D of item 3 have a relatively 
large attractiveness. 

In the foregoing the two types of DIF were studied
separately from each other. Also, only one item at a time
was studied. As was indicated earlier, it is also possible to
analyse models in which more than one item exhibits DIF. To
illustrate this possibility, model M, in which items 1, 2,
and 4 exhibit DIF in the guessing probabilities and where
item 2 exhibits DIF in the latent response, was considered.
Model M gives a considerable improvement in fit to the data
over model (10) (LR(10;M) = 100.5; DF = 13). From Table 2 it
also follows that model M fits the data better than the
models discussed before. The parameters, however, do not
differ much from the parameters of the previous models.
Therefore they are not given.

In summary, the difficulty of the four items can be
ordered in the following way:
$\delta_3 > \delta_1 > \delta_4 > \delta_2$. That is, item
2 is the easiest item and item 3 is the most difficult one.
The attractiveness of the alternatives of items 1, 2, and 4 as
well as the difficulty of solving item 2 are not the same for
the two subgroups. Item 3 exhibits no DIF in the latent
response or DIF in the guessing probabilities.

Discussion 

In the present paper a model for multiple choice items 
is proposed, which views the observed response of a subject 
to a certain item as a result of two distinct processes. The 
first process consists of solving the problem and the second 
process of giving the actual response. This model is extended 
with subgroups (observed or latent) in order to study DIF in 
the two processes. The model was illustrated with an example. 

In this paper all tests of DIF are two-sided. This
means that it is not possible to test directional hypotheses
about DIF. The estimated difficulty parameters and the
estimated guessing probabilities provide only an indication
of the direction of DIF.

Because LCAG requires much memory space,
it was not possible to consider more than four five-choice
items. A line of further research will be to find an
estimation method that overcomes this problem. Further
research should also give an answer to the question of whether
a certain model is identified or not.
Table 1

Likelihood ratio tests for detecting DIF on the data of the
Second International Mathematics Study

Item(s)    LR(10;9)    DF    LR(10;11)    DF
1            1.701      1     26.519*      4
2            4.720*     1     21.340*      4
3            1.747      1      6.033       4
4             .018      1     52.595*      4

Note. Tests marked with an asterisk are significant
(α = .05).
Table 2

Difficulty parameters of the items in the model of DIF in the
latent response in item 2

                      Item
Subgroup        1       2       3       4
HAVO/VWO      1.52    -.82    3.54   -1.32
MAVO          1.52   -1.90    3.54   -1.32

Note. The difficulty parameters of items 1, 3 and 4 for MAVO
are set equal to the difficulty parameters for HAVO/VWO.
Table 3

Guessing probabilities of the alternatives of the items

                     Alternatives
Item        A       B       C       D       E

Subgroup HAVO/VWO
1         .073    .033    .585    .274    .035
2         .743    .123    .061    .045    .028
3         .112    .327    .139    .323    .099
4         .110    .355    .235    .092    .208

Subgroup MAVO
1         .211    .024    .563    .193    .009
2         .662    .240    .068    .015    .015
3         .112    .327    .139    .323    .099
4         .068    .241    .341    .084    .266

Note 1. The correct alternatives are underlined.

Note 2. Item 3 was not significantly biased in the guessing
probabilities.